Communication device and control program for communication device

ABSTRACT

A communication device includes a calculation unit which calculates class probabilities that are probabilities an input speech belongs to a plurality of respective classified classes previously defined as types of speech contents, a plurality of response generation modules provided for respective types of responses each generates a response speech corresponding to the type, a determination unit which selects one of the plurality of response generation modules based on association probabilities and the class probabilities calculated by the calculation unit and determines the response speech generated by the selected response generation module as an output speech to be emitted to the user, the association probabilities being set for each of the plurality of response generation modules, and the association probabilities each indicating a level of association between the response generation module and each of the plurality of classified classes, and an output unit which outputs the output speech.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is based upon and claims the benefit of priority from Japanese patent application No. 2018-200832, filed on Oct. 25, 2018, the disclosure of which is incorporated herein in its entirety by reference.

BACKGROUND

The present disclosure relates to a communication device and a control program for the communication device.

There is a known technique of analyzing a user's speech to recognize a semantic content, generating a response speech according to the type of the user's speech, and presenting it to the user by a voice or a text (see, for example, Japanese Unexamined Patent Application Publication No. 2010-140282).

SUMMARY

In a speech response device of the related art, speeches in response to the user's speech become uniform, enabling a user to predict the response speech to some extent as he/she uses the device. That is, the user cannot get a feeling of conversing with a conversation partner having life or free will from the response speech of the speech response device, and thus may get bored with having a dialogue therewith.

The present disclosure provides a communication device or the like that can be recognized by a user as a conversation partner by generating a variety of response speeches.

A first example aspect of the present disclosure is a communication device including: an input unit configured to input an input speech that is a user's speech; a calculation unit configured to calculate class probabilities that are probabilities the input speech belongs to a plurality of respective classified classes previously defined as types of speech contents; a plurality of response generation modules provided for respective types of responses, each configured to generate a response speech corresponding to the type; a determination unit configured to select one of the plurality of response generation modules based on association probabilities and the class probabilities calculated by the calculation unit and determine the response speech generated by the selected response generation module as an output speech to be emitted to the user, the association probabilities being set for each of the plurality of response generation modules, and the association probabilities each indicating a level of association between the response generation module and each of the plurality of classified classes; and an output unit configured to output the output speech. With the communication device having such a configuration, the output speech is determined by the multiplication of the class probability by the association probability, so the selection variation of the output speech for the input speech is increased and the dialogues can be diverse and unexpected.

In the above communication device, the determination unit may be configured to randomly select one response generation module having a selection probability, which is obtained by multiplying the association probability by the class probability, greater than or equal to a reference value from among the plurality of response generation modules. Such a configuration can make the dialogue more unexpected.

The determination unit may be configured to multiply a past coefficient, which is set in such a way that a probability of selecting a previously-selected response generation module becomes low, by the association probability and select one of the plurality of response generation modules. Such a configuration can effectively prevent similar response speeches from being output.

Further, in the above communication device, the determination unit may select one of the plurality of response generation modules, and then the selected response generation module may generate the response speech. When the response generation module generates the response speech after it is selected, it is possible to avoid the unnecessary processing of having unselected response generation modules generate response speeches.

A second example aspect of the present disclosure is a control program of a communication device that causes a computer to execute: inputting an input speech that is a user's speech; calculating class probabilities that are probabilities the input speech belongs to a plurality of respective classified classes previously defined as types of speech contents; selecting one of the plurality of response generation modules based on association probabilities and the class probabilities calculated in the calculating and determining the response speech generated by the selected response generation module as an output speech to be emitted to the user, the association probabilities being set for each of the plurality of response generation modules, and the association probabilities each indicating a level of association between the response generation module and each of the plurality of classified classes; and outputting the output speech. With a communication device controlled by such a control program, the output speech is determined by the multiplication of the class probability by the association probability, so the selection variation of the output speech for the input speech is increased and the dialogues can be diverse and unexpected.

According to the present disclosure, it is possible to provide a communication device or the like that can be recognized by a user as a conversation partner by generating a variety of response speeches.

The above and other objects, features and advantages of the present disclosure will become more fully understood from the detailed description given hereinbelow and the accompanying drawings which are given by way of illustration only, and thus are not to be considered as limiting the present disclosure.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a view showing an example of communication between a robot and a user according to a first embodiment;

FIG. 2 is a system configuration diagram of the robot;

FIG. 3 is an example of a reference table which defines association probabilities;

FIG. 4 is a flowchart showing processing from receiving a user's speech to responding to it;

FIG. 5 is a flowchart showing a process of selecting a response generation module;

FIG. 6 is a view showing an example of communication between a robot and a user according to a second embodiment;

FIG. 7 is a system configuration diagram of the robot and a server; and

FIG. 8 is a view showing a smartphone according to a third embodiment.

DETAILED DESCRIPTION

FIG. 1 is a view showing an example of communication between a robot 100 and a user according to a first embodiment. The robot 100 is a communication device that conducts a voice dialogue with a human being who is a user. The robot 100 is a character device that embodies a character, and may be configured to change an eye expression and a line-of-sight direction according to the dialogue.

The robot 100 imitates an appearance of an animal and has a head 110 and a body 120. A microphone 101 is disposed in a hidden manner at any position of the head 110. The microphone 101 has a function as an input unit that inputs a user's spoken voice as an input speech. A speaker 102 is disposed in a hidden manner at a position corresponding to the mouth of the robot 100. The speaker 102 has a function as an output unit that emits a voice generated by the robot 100. Because the voice is output from the position of the mouth, the user feels as if the robot 100 is talking. As shown in the drawing, for example, when the user says to the robot 100, “What will the weather be like today?”, the robot 100 speaks in response, for example, “It will be sunny followed by clouds.”

FIG. 2 is a system configuration diagram of the robot 100. The main system configuration of the robot 100 includes the microphone 101, the speaker 102, a control unit 200, a speech database 210, a knowledge database 220, and a memory 230. The control unit 200 is composed of, for example, a CPU. The control unit 200 also operates as a function execution unit responsible for execution of each function, and mainly operates as a speech analysis unit 201, a class probability calculation unit 202, a generation module determination unit 203, a speech control unit 204, and a response generation module group 205.

A main function of the microphone 101 is to collect voices spoken by the user who is a partner the robot 100 has a dialogue with. The microphone 101 converts the collected user's spoken voice into a voice signal, and passes the voice signal to the speech analysis unit 201 as the input speech from the user.

The speech analysis unit 201 analyzes the input speech received from the microphone 101 and converts it into a text, and recognizes the user's speech content. Specifically, the speech analysis unit 201 recognizes the user's speech content using common voice recognition technology. For example, a word analysis or the like is performed on the text of the input speech, and the speech content is recognized using a DNN model or a logistic regression model. The speech analysis unit 201 passes the recognized speech content to the class probability calculation unit 202 and the response generation module group 205.

The class probability calculation unit 202 calculates a class probability that is a probability that the input speech received from the speech analysis unit 201 belongs to each of a plurality of classified classes previously defined as types of speech contents. In this embodiment, the input speech is classified into four types: “question”, “information provision”, “request”, and “non-dialogue”. Each of these four classifications is referred to as a classified class, and the class probability calculation unit 202 calculates the class probabilities as estimated probabilities that the input speech belongs to the “question class”, “information provision class”, “request class”, and “non-dialogue class”.

For example, the class probability of the question class is the probability that the content of the input speech is estimated to be something the user wants to know. For example, when the input speech is “What will the weather be like today?”, it is estimated that the user wants to know the weather today, so the class probability of the question class becomes a large value. The class probability of the information provision class is the probability that the content of the input speech is estimated to be something the user wants to convey. For example, when the input speech is “I hate vegetables”, it is estimated that the user wants the robot 100 to know his/her characteristics, so the class probability of the information provision class becomes a large value.

The class probability of the request class is the probability that the content of the input speech is estimated to be something the user wants the robot 100 to do. For example, when the input speech is “Turn on the light of the living room”, it is estimated that the user wants the robot 100 to transmit a control signal to turn on the light of the living room, so the class probability of the request class becomes a large value. The class probability of the non-dialogue class is the probability that the content of the input speech is estimated not to be directed to the robot 100. For example, when the input speech is “oh, I'm sleepy”, it is estimated that the user is talking to himself/herself, so the class probability of the non-dialogue class becomes a large value.

The class probability calculation unit 202 calculates the class probabilities with reference to the knowledge database 220. The knowledge database 220 is composed of, for example, a recording medium such as a hard disk drive, and stores many words, their attributes, an analysis grammar that defines a dependency relationship between words, etc. The knowledge database 220 need not be incorporated into the robot 100, and instead may be connected to, for example, a network to which the robot 100 can be connected. The class probability calculation unit 202, for example, refers to the knowledge database 220 to determine words to be excluded from words to be considered, from among a plurality of words included in the input speech, based on the number of words having the same attribute, the types of the attribute, and the analysis grammar. Then, the class probability calculation unit 202 calculates the class probability in accordance with a predetermined calculation formula. For example, the class probability calculation unit 202 outputs, in response to the input speech “What will the weather be like today?”, results such as the question class probability 70%, the information provision class probability 5%, the request class probability 10%, and the non-dialogue class probability 15%. When the class probability calculation unit 202 outputs the class probabilities that the input speech belongs to the respective classified classes, it passes the class probabilities to the generation module determination unit 203.

Note that, instead of the analytical calculation method using the knowledge database 220, a calculation method by artificial intelligence using logistic regression or a DNN (Deep Neural Network) may be employed. In this case, a learned model that outputs the class probabilities that the input speech belongs to the respective classified classes when the input speech is given thereto may be prepared. Each time the class probability calculation unit 202 receives the input speech from the speech analysis unit 201, it calculates the class probabilities using the learned model.
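As a minimal sketch of how a learned model might report such class probabilities, the following example normalizes per-class scores with a softmax so that the four values sum to 1. The softmax step, the function name class_probabilities, and the score values are assumptions made for illustration; the disclosure does not specify the calculation formula or the model output.

```python
import math

# The four classified classes named in this embodiment.
CLASSES = ["question", "information provision", "request", "non-dialogue"]


def class_probabilities(scores):
    """Convert raw per-class scores into probabilities that sum to 1 (softmax)."""
    peak = max(scores.values())  # subtract the peak for numerical stability
    exps = {cls: math.exp(score - peak) for cls, score in scores.items()}
    total = sum(exps.values())
    return {cls: value / total for cls, value in exps.items()}


# Hypothetical scores a model might emit for a question-like input speech.
scores = {
    "question": 2.0,
    "information provision": -0.6,
    "request": 0.1,
    "non-dialogue": 0.5,
}
probs = class_probabilities(scores)
for cls in CLASSES:
    print(f"{cls}: {probs[cls]:.1%}")
```

Any model whose outputs are non-negative and sum to 1 across the classified classes could take the place of this function.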

The response generation module group 205 is a collection of response generation modules that generate the response speeches corresponding to the set response types. In this embodiment, five response types are set in advance, which are “question response”, “associative response”, “example response”, “sympathetic response”, and “imitation response”. A question response generation module 205 a, an associative response generation module 205 b, an example response generation module 205 c, a sympathetic response generation module 205 d, and an imitation response generation module 205 e are prepared as the response generation modules for generating the response speeches that match the respective response types.

The question response is a response type that returns an answer to the question. For example, when the input speech is “I wonder if it rains tomorrow”, the question response generation module 205 a generates the response speech of “It will be sunny followed by clouds”. The associative response is a response type that returns a phrase associated with an input sentence. For example, when the input speech is “I wonder if it rains tomorrow”, the associative response generation module 205 b generates the response speech of “Be careful not to catch a cold”.

The example response is a response type that returns a phrase close to the input speech. For example, when the input speech is “I wonder if it rains tomorrow”, the example response generation module 205 c generates a response speech of “It is nice weather today, isn't it”. The sympathetic response is a response type that returns a phrase that is considerate of the user's emotion included in the input speech. For example, when the input speech is “I wonder if it rains tomorrow”, the sympathetic response generation module 205 d does not generate a response speech, because the input speech does not include a word having an attribute of emotion. The imitation response is a response type that imitates a part or all of the input speech and gives a parrot-like response. For example, when the input speech is “I wonder if it rains tomorrow”, the imitation response generation module 205 e generates the response speech of “Tomorrow?”
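The point that a response generation module may decline to generate a response speech can be sketched as follows. This is a toy illustration only: the function names, the EMOTION_WORDS list, the rule of echoing the last word, and the placeholder sympathetic phrase are all assumptions, and the actual modules refer to the speech database 210 rather than to hard-coded rules.

```python
from typing import Optional

# Hypothetical stand-in for words carrying an "emotion" attribute.
EMOTION_WORDS = {"happy", "sad", "hate", "love", "lonely"}


def imitation_response(text: str) -> Optional[str]:
    """Parrot back part of the input speech (here, simply its last word)."""
    words = text.rstrip("?.! ").split()
    if not words:
        return None
    return words[-1].capitalize() + "?"


def sympathetic_response(text: str) -> Optional[str]:
    """Return a phrase only when the input contains an emotion word."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    if not words & EMOTION_WORDS:
        return None  # no response speech is generated
    return "I understand how you feel."  # placeholder phrase


speech = "I wonder if it rains tomorrow"
print(imitation_response(speech))
print(sympathetic_response(speech))
```

Returning None models the case in which a module generates no response speech, which the selection logic has to handle.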

Each response generation module refers to the speech database 210 to generate the response speech matching the response type. The speech database 210 is composed of, for example, a recording medium such as a hard disk drive, and individual terms organized into a corpus are stored therein with reproducible speech data. The speech database 210 need not be incorporated into the robot 100, and instead may be connected to, for example, a network to which the robot 100 can be connected.

The generation module determination unit 203 selects one response generation module from the response generation module group 205 based on the class probabilities received from the class probability calculation unit 202 and association probabilities obtained by reading a reference table 231 stored in the memory 230. The specific selection method will be described later. The generation module determination unit 203 acquires the response speech generated by the selected response generation module from the selected response generation module, and determines to employ it as an output speech.

The speech control unit 204 converts the received response speech sentence into a voice signal and passes the voice signal to the speaker 102. The speaker 102 receives the voice signal converted by the speech control unit 204, and outputs an output speech as a voice. The memory 230 is a nonvolatile storage medium such as a flash memory. The memory 230 stores, in addition to the reference table 231, a control program for controlling the robot 100, various parameter values used for control and calculation, functions, lookup tables, etc.

FIG. 3 is an example of the reference table 231 which defines the association probabilities. The association probability is a value that is set for each of the response generation modules and indicates an association level for each of the above-described classified classes. For example, for the question response generation module 205 a, the association probability with the question class is defined as 70%, the association probability with the information provision class is defined as 15%, the association probability with the request class is defined as 10%, and the association probability with the non-dialogue class is defined as 5%. Likewise, for each of the associative response generation module 205 b, the example response generation module 205 c, the sympathetic response generation module 205 d, and the imitation response generation module 205 e, the association probability with the question class, the association probability with the information provision class, the association probability with the request class, and the association probability with the non-dialogue class are defined.

The generation module determination unit 203 calculates a selection probability by multiplying the class probability of each classified class received from the class probability calculation unit 202 by each association probability of the reference table 231. For example, suppose that the class probabilities are calculated as the question class probability 50%, the information provision class probability 25%, the request class probability 10%, and the non-dialogue class probability 15%. When the probability to be calculated is expressed by P(response generation module|classified class), the selection probabilities of the question response generation module 205 a are:

P(question response|question) = 70% × 50% = 35%
P(question response|information provision) = 15% × 25% = 3.75%
P(question response|request) = 10% × 10% = 1%
P(question response|non-dialogue) = 5% × 15% = 0.75%

Likewise, the selection probabilities of the associative response generation module 205 b are:

P(associative response|question) = 10% × 50% = 5%
P(associative response|information provision) = 40% × 25% = 10%
P(associative response|request) = 20% × 10% = 2%
P(associative response|non-dialogue) = 30% × 15% = 4.5%

The selection probability of the example response generation module 205 c, the selection probability of the sympathetic response generation module 205 d, and the selection probability of the imitation response generation module 205 e are calculated in a manner similar to the above.
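The multiplication of class probabilities by association probabilities can be sketched as follows, using only the two rows of the reference table 231 whose values are stated in the text. The dictionary layout and the function name selection_probabilities are assumptions made for illustration; values are held in percent.

```python
# Association probabilities (percent) for the two modules whose values are given.
# The remaining three rows of the reference table are not stated and are omitted.
ASSOCIATION = {
    "question response": {
        "question": 70, "information provision": 15,
        "request": 10, "non-dialogue": 5,
    },
    "associative response": {
        "question": 10, "information provision": 40,
        "request": 20, "non-dialogue": 30,
    },
}

# Class probabilities (percent) from the worked example.
CLASS_PROB = {
    "question": 50, "information provision": 25,
    "request": 10, "non-dialogue": 15,
}


def selection_probabilities(association, class_prob):
    """Return P(module|class) in percent for every module/class pair."""
    return {
        (module, cls): row[cls] * class_prob[cls] / 100
        for module, row in association.items()
        for cls in class_prob
    }


selection = selection_probabilities(ASSOCIATION, CLASS_PROB)
for (module, cls), p in selection.items():
    print(f"P({module}|{cls}) = {p}%")
```

Keying the result by (module, class) pairs keeps each product individually addressable, which matches the P(response generation module|classified class) notation.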

The generation module determination unit 203 searches for the selection probability having the largest value among the selection probabilities calculated in this way (P(question response|question) = 35% in the above example), and selects the response generation module corresponding to this value (in the above example, the question response generation module 205 a). Then, the generation module determination unit 203 acquires the response speech generated by the selected response generation module (e.g., “It will be sunny followed by clouds”), and uses this response speech as the output speech.

When the selected response generation module does not generate a response speech, the response generation module indicating the next largest selection probability is selected, and the response speech generated by this response generation module is used as the output speech. When there are a plurality of selection probabilities having the same largest value, the generation module determination unit 203 may randomly select one of the response generation modules corresponding to these largest values.
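The largest-value selection, the fallback to the next largest value, and the random tie-break can be sketched together as follows. The function name pick_module and the data layout are assumptions; best_by_module holds each module's largest selection probability from the worked example (35% and 10%).

```python
import random


def pick_module(best_by_module, responses, rng=random):
    """Pick the module with the largest selection probability.

    Ties are broken at random; a module that produced no response (None)
    is skipped in favor of the next largest selection probability.
    """
    remaining = dict(best_by_module)
    while remaining:
        top = max(remaining.values())
        tied = [m for m, p in remaining.items() if p == top]
        module = rng.choice(tied)
        response = responses.get(module)
        if response is not None:
            return module, response
        del remaining[module]  # fall back to the next candidate
    return None, None


best_by_module = {"question response": 35.0, "associative response": 10.0}
responses = {
    "question response": "It will be sunny followed by clouds",
    "associative response": "Be careful not to catch a cold",
}
chosen = pick_module(best_by_module, responses)
# Same input, but the question response module generated nothing.
fallback = pick_module(best_by_module, {**responses, "question response": None})
print(chosen)
print(fallback)
```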

According to such a method of calculating the selection probability and determining the output speech, the selection variation of the output speech for the input speech is increased, so that the dialogues can be diverse and unexpected. That is, even a small difference in how the user phrases a speech could make the voices returned from the robot 100 different, thereby reducing the possibility that the user soon gets bored with the dialogue. It is particularly effective in chatty dialogues where the conversational ball is kept rolling, because diversity and unexpectedness become the core factors in continuing the dialogue.

In order to further exert the diversity and unexpectedness, the generation module determination unit 203 may extract the selection probability having a predetermined reference value or greater from the calculated selection probabilities, and randomly select one of the response generation modules corresponding to the extracted selection probability. For example, when the reference value of the selection probability is set to P₀=35%, if the selection probability in which P>P₀ holds appears in the question response generation module 205 a, the example response generation module 205 c and the sympathetic response generation module 205 d, the generation module determination unit 203 randomly selects one of these three generation modules.
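A minimal sketch of this reference-value filtering followed by a random draw is given here. The function name, the data layout, and every probability value other than the 35% reference value are hypothetical. The comparison is written as "greater than or equal to", following the "reference value or greater" wording; a strict comparison would be a one-character change.

```python
import random


def pick_at_random_above(best_by_module, reference, rng=random):
    """Randomly choose among modules whose selection probability meets the reference."""
    candidates = [m for m, p in best_by_module.items() if p >= reference]
    return rng.choice(candidates) if candidates else None


# Hypothetical largest selection probabilities (percent) per module.
best_by_module = {
    "question response": 35.0,
    "example response": 40.0,
    "sympathetic response": 36.0,
    "imitation response": 12.0,
}
picked = pick_at_random_above(best_by_module, reference=35.0, rng=random.Random(0))
print(picked)
```

Passing a seeded random.Random instance makes the draw reproducible for testing, while the default uses the module-level generator.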

When dialogues are continuously conducted, a calculation may be performed in such a way that the selection probability of the response generation module already selected in the series of dialogues becomes low. For example, the selection probability is calculated after multiplying the association probability by a past coefficient (a number that is 0 or greater and less than 1) which changes depending on how frequently the module has been selected in the past or whether it was selected the last time. In this way, when the selection probability of the already selected response generation module is calculated to be low, it is possible to prevent similar responses from being output.
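One possible form of the past coefficient is sketched here. The disclosure only states that the coefficient is 0 or greater and less than 1 and depends on past selection frequency and on whether the module was selected the last time; the geometric decay, the parameter names base and decay, and their default values are assumptions.

```python
def past_coefficient(times_selected, selected_last_time, base=0.9, decay=0.5):
    """Return a coefficient in [0, 1) that shrinks as a module is reused.

    base  : hypothetical coefficient for a module never selected so far
    decay : hypothetical factor applied once per past selection, and once
            more if the module was selected the last time
    """
    coefficient = base * (decay ** times_selected)
    if selected_last_time:
        coefficient *= decay
    return coefficient


fresh = past_coefficient(times_selected=0, selected_last_time=False)
reused = past_coefficient(times_selected=2, selected_last_time=True)
print(fresh, reused)

# Applied to the question class association probability of 70%:
print(70 * fresh, 70 * reused)
```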

Next, a flow of processing performed by the control unit 200 from the reception of the user's speech to the response thereto will be described. FIG. 4 is a flowchart showing processing from the reception of the user's speech to the response thereto. The flow of FIG. 4 shows processing from the user speaking one phrase to the robot 100 returning one phrase.

In Step S101, when the control unit 200 acquires the user's speech via the microphone 101, the speech analysis unit 201 serving as a functional block analyzes and recognizes the user's speech as the input speech in Step S102. The speech analysis unit 201 passes the recognized speech content to the class probability calculation unit 202 and the response generation module group 205.

In Step S103, the class probability calculation unit 202 calculates a class probability which is a probability that the input speech belongs to each of the classified classes. When the class probability calculation unit 202 has calculated the class probabilities that the input speech belongs to the respective classified classes, it passes the values of the calculated class probabilities to the generation module determination unit 203.

In Step S104, the generation module determination unit 203 reads the reference table 231 from the memory 230 and acquires the association probability for each classified class of each response generation module. Then, in Step S105, one response generation module is determined from the response generation module group 205. The specific process flow of Step S105 will be described using FIG. 5.

FIG. 5 is a sub-flow diagram showing a process of selecting the response generation module. In Step S1051, the generation module determination unit 203 first calculates the past coefficient. The past coefficient is calculated for each response generation module, and increases or decreases depending on the frequency at which the response generation module in question has been selected in the past and whether it was selected the last time. The generation module determination unit 203 proceeds to Step S1052 where it searches for the selection probability P greater than the reference value P₀ from among the selection probabilities P obtained by multiplying the past coefficient, the association probability, and the class probability, and extracts the response generation module corresponding to the selection probability.

Then, in Step S1053, one of the plurality of extracted response generation modules is randomly selected. When there is only one selection probability P greater than the reference value P₀, the response generation module corresponding to the selection probability is selected. When there is no selection probability P greater than the reference value P₀, the response generation module corresponding to the selection probability having the largest value is selected.
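Steps S1052 and S1053 can be sketched as a single function, given here under stated assumptions: the past coefficients of Step S1051 are passed in as already-computed values, probabilities are expressed as fractions, only the two reference-table rows stated in the text are used, and the past coefficient values 0.5 and 0.9 are hypothetical.

```python
import random


def select_module(class_prob, association, past_coeff, reference, rng=random):
    """Sub-flow of Step S105: return the name of one response generation module."""
    # S1052: P = past coefficient x association probability x class probability;
    # keep the largest P of each module.
    best = {
        module: max(past_coeff[module] * row[c] * class_prob[c] for c in class_prob)
        for module, row in association.items()
    }
    above = [m for m, p in best.items() if p > reference]
    # S1053: random choice among several, the single candidate if only one,
    # otherwise the module with the largest P.
    if len(above) > 1:
        return rng.choice(above)
    if len(above) == 1:
        return above[0]
    return max(best, key=best.get)


class_prob = {"question": 0.50, "information provision": 0.25,
              "request": 0.10, "non-dialogue": 0.15}
association = {
    "question response": {"question": 0.70, "information provision": 0.15,
                          "request": 0.10, "non-dialogue": 0.05},
    "associative response": {"question": 0.10, "information provision": 0.40,
                             "request": 0.20, "non-dialogue": 0.30},
}
past_coeff = {"question response": 0.5, "associative response": 0.9}  # hypothetical

print(select_module(class_prob, association, past_coeff, reference=0.35))
print(select_module(class_prob, association, past_coeff, reference=0.05))
```

With these values the largest products are 0.175 for the question response module and 0.09 for the associative response module, so a reference value of 0.35 leaves no candidate and the largest value is used, whereas 0.05 admits both and the choice is random.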

Returning to the flow of FIG. 4, the description is continued. In Step S106, each response generation module of the response generation module group 205 receives the speech content recognized by the speech analysis unit 201, and generates a response speech that matches the response type of the corresponding response generation module. Because Step S106 requires the speech content recognized in Step S102, it may be performed in parallel to Steps S103 to S105, or may be performed before Step S103 or after Step S105.

The generation module determination unit 203 proceeds to Step S107 where it checks whether the response generation module selected in Step S105 has generated the response speech. When the response speech has not been generated (Step S107: NO), the process proceeds to Step S108, and the response generation module is reselected. For example, as described above, the response generation module having the next largest selection probability is selected. Alternatively, the generation module determination unit 203 may randomly select the response generation module from the remaining response generation modules.

When the response generation module selected in Step S105 has generated the response speech (Step S107: YES), the process proceeds to Step S109 where the response speech is acquired and used as the output speech. In Step S110, the speech control unit 204 converts the output speech received from the generation module determination unit 203 into the voice signal and controls the speaker 102 to emit it. Then, the series of processes is completed. If there is another speech from the user, the processes are repeated in a manner similar to the above.

In the above process flow, an example in which all response generation modules generate the response speeches has been described. Alternatively, only the response generation module selected by the generation module determination unit 203 may generate the response speech after the selection. In this case, “GENERATE RESPONSE SPEECH” of Step S106 is performed after Step S105. When the selected response generation module generates the response speech after it is selected, it is possible to avoid the unnecessary processing of having unselected response generation modules generate response speeches. On the other hand, when each response generation module generates the response speech prior to the selection by the generation module determination unit 203, a quick response is achieved. These specifications may be determined according to the environment in which the robot 100 is used.
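The trade-off between generating all response speeches in advance and generating only after selection can be sketched as follows. The function respond, the eager flag, the stub modules, and the selection order in ranked are all assumptions made for illustration; the stubs record each call so that the amount of generation work is visible.

```python
def respond(text, modules, ranked, eager=True):
    """Return (module name, response speech) for the first module that responds.

    modules : {name: callable(text) -> str or None}
    ranked  : module names in the order they would be selected and reselected
    eager   : True  -> every module generates up front (Step S106 before selection)
              False -> a module generates only once it is selected
    """
    cache = {name: fn(text) for name, fn in modules.items()} if eager else {}
    for name in ranked:                      # S105, then S108 on reselection
        response = cache[name] if eager else modules[name](text)
        if response is not None:             # S107
            return name, response            # S109
    return None, None


calls = []  # records which stub modules actually ran


def make_stub(name, reply):
    def module(text):
        calls.append(name)
        return reply
    return module


modules = {
    "question response": make_stub("question response",
                                   "It will be sunny followed by clouds"),
    "sympathetic response": make_stub("sympathetic response", None),
    "imitation response": make_stub("imitation response", "Tomorrow?"),
}
ranked = ["sympathetic response", "imitation response", "question response"]

eager_result = respond("I wonder if it rains tomorrow", modules, ranked, eager=True)
eager_calls = list(calls)
calls.clear()
lazy_result = respond("I wonder if it rains tomorrow", modules, ranked, eager=False)
lazy_calls = list(calls)
print(eager_result, len(eager_calls))
print(lazy_result, len(lazy_calls))
```

Both modes return the same output speech; the eager mode runs all three stubs, while the lazy mode runs only the modules that were actually selected.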

Next, a second embodiment will be described. FIG. 6 is a diagram showing an example of communication between a robot and a user according to the second embodiment. In the first embodiment, the main body includes all the main functional elements so that the robot 100 can communicate with the user on its own. In contrast, a robot 100′ according to the second embodiment employs a configuration in which a server 300 is responsible for functional elements related to calculation.

For example, when the user says to the robot 100′, “What will the weather be like today?”, the microphone of the robot 100′ captures the voice. The robot 100′ converts the captured voice into a voice signal and transmits the voice signal to the server 300 by radio communication. The server 300 uses this information to select the voice data of the response voice (in the example of the drawing, “It will be sunny followed by clouds”) and transmits it to the robot 100′. The robot 100′ emits a voice corresponding to the received voice data from the speaker 102.

FIG. 7 is a system configuration diagram of the robot 100′ and the server 300. Components responsible for the same functions as those of the components described in the first embodiment are, in principle, denoted by the same signs, and the description of such functions is omitted. In this embodiment, the server 300 functions as an entity of a communication device that executes various operations, etc.

The robot 100′ includes the microphone 101 and the speaker 102 in the same manner as in the robot 100. The control unit 190 converts a voice signal received from the microphone 101 into voice data, and transmits the voice data to the server 300 via the communication unit 191. Further, the control unit 190 converts the voice data received via the communication unit 191 into a voice signal, and controls the speaker 102 to emit it. The communication unit 191 is a communication interface for exchanging the control signals and voice data with the server 300 via a network, and is, for example, a wireless LAN unit.

Like the robot 100, the server 300 includes the control unit 200, the speech database 210, the knowledge database 220, and the memory 230. Further, the communication unit 291 is a communication interface for exchanging the control signals and voice data with the robot 100′ via the network. The communication unit 291 is, for example, a wireless LAN unit.

The speech analysis unit 201 receives the user speech as the input speech via the communication unit 291. Further, the speech control unit 204 passes the voice data of the output speech received from the generation module determination unit 203 to the communication unit 291.

Like the first embodiment, such a system configuration according to the second embodiment can achieve communication with the user. Further, by integrating the functions related to the calculation in the server 300, the configuration of the robot 100′ can be simplified, and smooth communication is possible even without including a high-performance control chip in the robot 100′. Further, when the server 300 is responsible for the functions related to the calculation, it can sequentially respond to calculation requests from a plurality of robots 100′, thereby reducing the manufacturing cost of the entire system.

Next, a third embodiment will be described. FIG. 8 is a diagram showing a smartphone 700 according to the third embodiment. In the first and second embodiments, the robots 100 and 100′ embodying the character are the partners with which the user has a dialogue. In the third embodiment, however, a video character 800 displayed on the smartphone 700 is the partner with which the user has a dialogue. When a robot embodies the character, the user may feel as if it were a pet and become more attached to it. The smartphone 700, on the other hand, can express the character more easily.

A system configuration of the smartphone 700 is almost the same as the system configuration of the robot 100 according to the first embodiment described with reference to FIG. 2. The parts of the configuration of the smartphone 700 that differ from that of the robot 100 will be described below, and the description of the common configuration is omitted.

The smartphone 700 includes a display panel 710, a microphone 711, and a speaker 712. The display panel 710 is, for example, a liquid crystal panel, and displays the video character 800. The microphone 711 is an element that replaces the microphone 101 in the first embodiment, and collects the user's spoken voices. The speaker 712 is an element that replaces the speaker 102 in the first embodiment, receives a voice signal converted by the speech control unit 204, and outputs an output speech.

Further, an input text window 721 showing an input speech, which is the user's speech, is displayed on the display panel 710. The speech analysis unit 201 converts the input speech into text to generate the input text window 721 and displays it on the display panel 710. Further, an output text window 722 showing the output speech, which is the selected response speech, as text is displayed on the display panel 710. The speech control unit 204 converts the response speech sentence into text to generate the output text window 722 and displays it on the display panel 710.

By displaying the input speech and the output speech as text information in this manner, it is possible to visually confirm the dialogue. Further, when the voice output is turned off, the user can enjoy communication in a quiet environment without bothering people in the surroundings. Moreover, when the input speech is provided as text using a text input function of the smartphone 700 instead of being provided as a voice, the user can enjoy communication without using voices. In this case, both the input speech and the output speech are processed as text information.

When the smartphone 700 is made to function as a communication device in this way, the user can more easily enjoy the dialogue with the character, because dedicated hardware serving as the communication device is not necessary. Moreover, when the smartphone 700 is configured in such a way that the user can have a dialogue with the video character 800 in conjunction with another application of the smartphone 700, it can be incorporated into various applications. The smartphone 700 may be a system in which a server cooperates with the smartphone as in the second embodiment.

The program can be stored and provided to a computer using any type of non-transitory computer readable media. Non-transitory computer readable media include any type of tangible storage media. Examples of non-transitory computer readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), optical magnetic storage media (e.g. magneto-optical disks), CD-ROM (compact disc read only memory), CD-R (compact disc recordable), CD-R/W (compact disc rewritable), and semiconductor memories (such as mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash ROM, RAM (random access memory), etc.). The program may be provided to a computer using any type of transitory computer readable media. Examples of transitory computer readable media include electric signals, optical signals, and electromagnetic waves. Transitory computer readable media can provide the program to a computer via a wired communication line (e.g. electric wires, and optical fibers) or a wireless communication line.

From the disclosure thus described, it will be obvious that the embodiments of the disclosure may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the disclosure, and all such modifications as would be obvious to one skilled in the art are intended for inclusion within the scope of the following claims.

What is claimed is:
 1. A communication device comprising: an input unit configured to input an input speech that is a user's speech; a calculation unit configured to calculate class probabilities that are probabilities the input speech belongs to a plurality of respective classes previously defined as types of speech contents; a plurality of response generation modules provided for respective types of responses, each of the plurality of response generation modules configured to generate a response speech corresponding to one of the types of responses; a determination unit configured to select one of the plurality of response generation modules based on association probabilities and the class probabilities calculated by the calculation unit and determine response speech generated by the selected response generation module as an output speech to be emitted to the user, the association probabilities being set for each of the plurality of response generation modules, and each of the association probabilities indicating a level of association between one of the response generation modules and one of the plurality of classes; and an output unit configured to output the output speech.
 2. The communication device according to claim 1, wherein the determination unit is configured to randomly select one response generation module having a selection probability, which is obtained by multiplying one of the association probabilities by one of the class probabilities, greater than or equal to a reference value from among the plurality of response generation modules.
 3. The communication device according to claim 1, wherein the determination unit is configured to multiply a past coefficient, which is set in such a way that a probability of selecting a previously-selected response generation module becomes low, by one of the association probabilities and select one of the plurality of response generation modules.
 4. The communication device according to claim 1, wherein the selected response generation module is configured to generate, after the determination unit selects the one of the plurality of response generation modules, response speech.
 5. A non-transitory computer readable medium storing a control program of a communication device, the control program, when executed by a computer, causing the computer to execute: inputting an input speech that is a user's speech; calculating class probabilities that are probabilities the input speech belongs to a plurality of respective classes previously defined as types of speech contents; selecting one of a plurality of response generation modules based on association probabilities and the class probabilities calculated in the calculating, generating a response speech by the selected response generation module, and determining the response speech generated by the selected response generation module as an output speech to be emitted to the user, the association probabilities being set for each of the plurality of response generation modules, and each of the association probabilities indicating a level of association between one of the response generation modules and one of the plurality of classes; and outputting the output speech.
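
For illustration only, the selection recited in claims 2 and 3 can be sketched as follows. The sketch forms no part of the claims and does not limit them. The function name `select_module`, the class and module names used in the usage note, the default reference value of 0.2, the use of the largest product over all classes as the selection probability of a module, and the return of `None` when no module reaches the reference value are all assumptions made for this example.

```python
import random
from typing import Dict, Optional


def select_module(
    class_probs: Dict[str, float],
    association: Dict[str, Dict[str, float]],
    reference: float = 0.2,                    # assumed reference value
    past: Optional[Dict[str, float]] = None,   # assumed past coefficients
    rng=random,
) -> Optional[str]:
    """Randomly pick one module whose selection probability meets the reference."""
    past = past or {}
    candidates = []
    for module, assoc in association.items():
        # A coefficient below 1.0 lowers the chance of re-selecting a module.
        coeff = past.get(module, 1.0)
        # Assumption: a module's selection probability is its largest
        # (past coefficient x association probability x class probability).
        score = max(
            coeff * assoc.get(cls, 0.0) * prob
            for cls, prob in class_probs.items()
        )
        if score >= reference:
            candidates.append(module)
    if not candidates:
        return None  # assumed fallback; the disclosure above does not define one
    return rng.choice(candidates)
```

As a usage example under the same assumptions, with class probabilities `{"question": 0.7, "chat": 0.3}` and association probabilities `{"answer": {"question": 0.8, "chat": 0.1}, "echo": {"question": 0.1, "chat": 0.2}}`, only the hypothetical module `"answer"` reaches the reference value, so it is the one selected. Passing `past={"answer": 0.1}` drops it below the reference value as well.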